117 research outputs found

    Detecting epistasis via Markov bases

    Rapid research progress in genotyping techniques has enabled large genome-wide association studies. Existing methods often focus on determining associations between single loci and a specific phenotype. However, a particular phenotype is usually the result of complex relationships between multiple loci and the environment. In this paper, we describe a two-stage method for detecting epistasis by combining the traditionally used single-locus search with a search for multiway interactions. Our method is based on an extended version of Fisher's exact test. To perform this test, a Markov chain is constructed on the space of multidimensional contingency tables using the elements of a Markov basis as moves. We test our method on simulated data and compare it to a two-stage logistic regression method and to a fully Bayesian method, showing that we are able to detect the interacting loci when other methods fail to do so. Finally, we apply our method to a genome-wide data set consisting of 685 dogs and identify epistasis associated with canine hair length for four pairs of SNPs.
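
    The Markov-chain construction mentioned in this abstract can be illustrated on a much simpler case than the paper's. The sketch below is not the authors' method: it assumes a two-way table with fixed row and column sums, where the basic +1/-1 moves on 2x2 minors are known to form a Markov basis, and it uses the Pearson chi-square statistic. All function names and parameters (`chi_square`, `markov_step`, `mcmc_p_value`, `n_steps`, `burn_in`) are hypothetical.

```python
import random


def chi_square(table):
    """Pearson chi-square statistic of a two-way table under independence."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / total
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def markov_step(table, rng):
    """One Metropolis step using a basic move; row and column sums are kept."""
    # Ordered pairs of rows and columns: swapping i1 and i2 gives the
    # opposite sign of the same move, so the proposal is symmetric.
    i1, i2 = rng.sample(range(len(table)), 2)
    j1, j2 = rng.sample(range(len(table[0])), 2)
    # Move: +1 at (i1, j1) and (i2, j2), -1 at (i1, j2) and (i2, j1).
    if table[i1][j2] == 0 or table[i2][j1] == 0:
        return  # the move would create a negative entry; stay put
    # Target is the hypergeometric law, proportional to 1 / prod(n_ij!).
    ratio = (table[i1][j2] * table[i2][j1]) / (
        (table[i1][j1] + 1) * (table[i2][j2] + 1)
    )
    if rng.random() < ratio:
        table[i1][j1] += 1
        table[i2][j2] += 1
        table[i1][j2] -= 1
        table[i2][j1] -= 1


def mcmc_p_value(observed, n_steps=20000, burn_in=2000, seed=0):
    """Estimate the exact-test p-value as the share of sampled tables whose
    statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    table = [list(row) for row in observed]
    threshold = chi_square(observed) - 1e-9  # tolerance for float ties
    hits = 0
    for step in range(burn_in + n_steps):
        markov_step(table, rng)
        if step >= burn_in and chi_square(table) >= threshold:
            hits += 1
    return hits / n_steps
```

    For example, `mcmc_p_value([[10, 0], [0, 10]])` samples tables with all margins equal to 10. Multiway tables, as in the abstract, need a Markov basis computed for the specific model, which this sketch does not attempt.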

    Packing ellipsoids with overlap

    The problem of packing ellipsoids of different sizes and shapes into an ellipsoidal container so as to minimize a measure of overlap between ellipsoids is considered. A bilevel optimization formulation is given, together with an algorithm for the general case and a simpler algorithm for the special case in which all ellipsoids are in fact spheres. Convergence results are proved and computational experience is described and illustrated. The motivating application, chromosome organization in the human cell nucleus, is discussed briefly, and some illustrative results are presented.

    Scalable Unbalanced Optimal Transport using Generative Adversarial Networks

    Generative adversarial networks (GANs) are an expressive class of neural generative models with tremendous success in modeling high-dimensional continuous measures. In this paper, we present a scalable method for unbalanced optimal transport (OT) based on the generative-adversarial framework. We formulate unbalanced OT as a problem of simultaneously learning a transport map and a scaling factor that push a source measure to a target measure in a cost-optimal manner. In addition, we propose an algorithm for solving this problem based on stochastic alternating gradient updates, similar in practice to GANs. We also provide theoretical justification for this formulation, showing that it is closely related to an existing static formulation by Liero et al. (2018), and perform numerical experiments demonstrating how this methodology can be applied to population modeling.

    Mastitis in dairy production: Estimation of sensitivity, specificity and disease prevalence in the absence of a gold standard

    Mastitis, a worldwide endemic disease of dairy cows, is an important cause of decreased efficiency in milk production. Early medical treatment can reduce the nonreversible losses in milk production caused by this infection. Various diagnostic tests for mastitis are available, including a test measuring the electrical conductivity of milk (MEC test), the industry standard of somatic cell counting (SCC test), a bacteriological test, and a recently developed test measuring mammary associated amyloid A (MAA test). None of these tests is considered a gold standard, however. The aim of the present study was to determine which of these tests provides the best results, and at what cost, to improve the efficiency of milk production. For this study, 25 cows were tested at all four quarters of the udder with each of the aforementioned mastitis diagnostic tests. Based on the data, the disease prevalence as well as the sensitivity and the specificity of the four tests were estimated with a Bayesian approach by extending the Hui and Walter model with two independent tests and two populations to a model with four partially dependent tests and one population. This model was further combined with a receiver operating characteristics analysis to estimate the overall test accuracy
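
    For orientation, the baseline Hui and Walter model that this abstract starts from can be written as a latent class model. The block below is a sketch of that standard two-test, two-population form only, not of the four-test extension the study describes. The symbols are assumptions introduced here: $\pi_k$ is the prevalence in population $k$, $Se_i$ and $Sp_i$ are the sensitivity and specificity of test $i$, and $t_i \in \{0, 1\}$ is the result of test $i$.

```latex
% Probability of observing results (t_1, t_2) in population k, assuming the
% two tests are conditionally independent given the true disease status.
\[
P(T_1 = t_1, T_2 = t_2 \mid k)
  = \pi_k \prod_{i=1}^{2} Se_i^{\,t_i} (1 - Se_i)^{1 - t_i}
  + (1 - \pi_k) \prod_{i=1}^{2} (1 - Sp_i)^{t_i} \, Sp_i^{\,1 - t_i},
  \qquad k = 1, 2 .
\]
```

    With two populations this gives six free cell probabilities for six unknowns ($\pi_1$, $\pi_2$, $Se_1$, $Se_2$, $Sp_1$, $Sp_2$), which is why no gold standard is needed in the baseline setting.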

    Geometry of maximum likelihood estimation in Gaussian graphical models

    We study maximum likelihood estimation in Gaussian graphical models from a geometric point of view. An algebraic elimination criterion allows us to find exact lower bounds on the number of observations needed to ensure that the maximum likelihood estimator (MLE) exists with probability one. This is applied to bipartite graphs, grids and colored graphs. We also study the ML degree, and we present the first instance of a graph for which the MLE exists with probability one, even when the number of observations equals the treewidth. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/11-AOS957 by the Institute of Mathematical Statistics (http://www.imstat.org)

    Geometry of Log-Concave Density Estimation

    Shape-constrained density estimation is an important topic in mathematical statistics. We focus on densities on $\mathbb{R}^d$ that are log-concave, and we study geometric properties of the maximum likelihood estimator (MLE) for weighted samples. Cule, Samworth, and Stewart showed that the logarithm of the optimal log-concave density is piecewise linear and supported on a regular subdivision of the samples. This defines a map from the space of weights to the set of regular subdivisions of the samples, i.e. the face poset of their secondary polytope. We prove that this map is surjective. In fact, every regular subdivision arises in the MLE for some set of weights with positive probability, but coarser subdivisions appear to be more likely to arise than finer ones. To quantify these results, we introduce a continuous version of the secondary polytope, whose dual we name the Samworth body. This article establishes a new link between geometric combinatorics and nonparametric statistics, and it suggests numerous open problems. Comment: 22 pages, 3 figures